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Abstract.  We  study  the  solutions  of  Hermitian  positive  definite  Toeplitz 
systems  Ax  =  6  by  the  preconditioned  conjugate  gradient  method  for  three 
families  of  circulant  preconditioners  C.  The  convergence  rates  of  these 
iterative  methods  depend  on  the  spectrum  of  C~1A.  For  a  Toeplitz  matrix 
A  with  entries  which  are  Fourier  coefficients  of  a  positive  function  /  in  the 
Wiener  class,  we  establish  the  invertiblity  of  C,  and  that  the  spectrum  of 
the  preconditioned  matrix  C~1A  clusters  around  one.  We  prove  that  if  / 
is  (/  +  l)-times  differentiate,  with  /  >  0,  then  the  error  after  1q  conjugate 
gradient  steps  has  decreased  like  ((q  —  l)\)~21.  We  also  show  that  if  C  copies 
the  central  diagonals  of  A,  then  C  minimizes  \\C  —  A\\i  and  \\C  —  A\\oo. 
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1.  Introduction. 

In  this  paper  we  discuss  the  solutions  to  a  class  of  Hermitian  positive 
definite  Toeplitz  systems  Ax  —  b  by  the  preconditioned  conjugate  gradient 
method.  Direct  methods  that  are  based  on  the  Levinson  recursion  formula 
are  in  constant  use;  see  for  instance,  Levinson  [10]  and  Trench  [12].  For 
an  n  by  n  Toeplitz  matrix  An,  these  methods  require  0(n2)  operations. 
Faster  algorithms  that  require  0(n  log2  n)  operations  have  been  developed, 
see  Bitmead  and  Anderson  [l]  and  Brent,  Gustavson  and  Yun  [2].  The 
stability  properties  of  these  direct  methods  for  symmetric  positive  definite 
matrices  are  discussed  in  Bunch  [3]. 

In  [11],  Strang  proposed  using  preconditioned  conjugate  gradient  method 
with  circulant  preconditioners  for  solving  symmetric  positive  definite  Toe- 
plitz systems.  The  number  of  operations  per  iteration  is  of  order  O(nlogn) 
as  circulant  systems  can  be  solved  efficiently  by  the  Fast  Fourier  Transform. 
R.  Chan  and  Strang  [4]  then  considered  using  a  circulant  preconditioner  Cn 
that  is  obtained  by  copying  the  central  diagonals  of  An  and  bringing  them 
around  to  complete  the  circulant.  In  that  paper,  we  proved  that  if  the 
underlying  generating  function  /,  the  Fourier  coefficients  of  which  give  the 
entries  of  An,  is  a  positive  function  in  the  Wiener  class,  then  for  n  suffi- 
ciently large,  Cn  and  C"1  are  uniformly  bounded  in  the  l^  norm  and  that 
the  eigenvalues  of  the  preconditioned  matrix  C~xAn  cluster  around  1.  We 
note  that  /  is  an  even  function  as  An  are  symmetric. 

In  this  paper,  we  extend  these  results  to  Hermitian  positive  definite 
Toeplitz  systems.  More  precisely,  we  show  in  §2  that  if  the  generating 
function  /  is  a  real-valued  positive  function  in  the  Wiener  class,  then  the 
spectrum  of  C~lAn  is  clustered  around  1.  We  remark  that  the  proof  given 
in  R.  Chan  and  Strang  [4]  cannot  be  readily  generalized  to  cover  this  case. 
In  fact,  for  Hermitian  An,  the  Hankel  matrices  Hn/2  used  in  the  proof 
in  [4]  are  not  Hermitian,  and  the  Circulant- Toeplitz  eigenvalue  problem 
cannot  be  split  into  two  similar  Toeplitz-Hankel  eigenvalue  problems.  In 
§3,  we  establish  the  superlinear  convergence  rate  of  the  conjugate  gradient 
method  when  applied  to  these  preconditioned  systems.  In  particular,  we 
show  that  if  /  is  (/  +  l)-times  differentiable,  with  /  >  0,  then  the  error  after 
2q  conjugate  gradient  steps  has  decreased  like  ((q  —  l)!)-2'. 


Ln  §4,  we  discuss  other  viable  preconditioners  for  the  same  problem.  We 
show  that  the  preconditioned  systems  for  these  preconditioners  also  have 
clustered  spectra  around  1  for  large  n  and  that  they  all  have  the  same 
asymptotic  convergence  rate.  In  §5,  we  show  that  the  preconditioner  that 
copies  the  central  diagonals  of  An  is  optimal  in  the  sense  that  it  minimizes 
|| Fn  —  An\\i  =  \\F„  —  AnWoo  over  all  Hermitian  circulant  matrices  Fn.  Finally, 
numerical  results  are  given  in  §6. 

2.  The  Spectrum  of  the  Preconditioned  Matrix. 

Let  us  first  assume  that  the  Hermitian  Toeplitz  matrices  An  are  finite 
sections  of  a  fixed  singly  infinite  positive  definite  matrix  A,*,,  see  R.  Chan 
and  Strang  [4].  Thus  the  (*,j)-th  entries  of  An  and  A^  are  a^j,  with  a*  = 
a_i  for  all  k.  We  associate  with  A^  the  real-valued  generating  function 

/(*)  =  £>*«-", 

—  oo 

defined  on  |0,27r).  We  will  assume  that  /  is  a  positive  function  and  is  in 
the  Wiener  class,  i.e.  the  sequence  {ak}tL-oo  xs  m  'l*  ^  then  easily  follows 
that  An  are  Hermitian  positive  definite  matrices  for  all  n,  see  for  instance, 
Grenander-Szego  [8].  Moreover,  if 

0  <   /min   <   /  <   /max   <  OO, 

then  the  spectrum  o{An)  of  An  satisfies 

*(i4»)  £  [/nun. /«.]•  (1) 

Let  Cn  be  the  Hermitian  circulant  preconditioner  that  copies  the  central 
diagonals  of  An.  More  precisely,  the  entries  c,;  =  c,_y  of  Cn  are  given  by 


Ck 


a*       0  <  k  <  m, 
ak.n     m<k<n,  (2) 

c_k     0  <  -k  <  n. 


For  simplicity,  we  are  assuming  here  and  in  the  following  that  n  =  2m  +  1. 
The  case  where  n  —  2m  can  be  treated  similarly,  and  in  that  case,  we  define 
cm  =  (<*m  +  «-m)/2,  see  (17)  below. 


We  will  show  that  CnlAn  has  a  clustered  spectrum.  We  first  note  that 

Theorem  1.  Suppose  f  is  positive  and  is  in  the  Wiener  class.  Then  for 
large  n,  the  circulants  Cn  and  C~l  are  uniformly  bounded  in  the  l2  norm. 
In  fact,  for  large  n,  the  spectrum  o(Cn)  of  Cn  satisfies 

o{Cn)  C  [/„„„, /max].  (3) 


The  proof  of  this  Theorem  is  similar  to  the  proof  of  Theorem  1  of  R.  Chan 
and  Strang  [4],  and  we  therefore  omit  it. 

Next  we  show  that  An  —  Cn  has  a  clustered  spectrum: 

Theorem  2.  Let  f  be  a  positive  function  in  the  Wiener  class,  then  for 
all  e  >  0,  there  exist  M  and  N  >  0  such  that  for  all  n  >  N ,  at  most  M 
eigenvalues  of  Cn  —  An  have  absolute  values  exceeding  e. 

Proof:  Clearly  R„  =  Cn  —  An  is  a  Hermitian  Toeplitz  matrix  with  entries 
Uj  =  r,_,  given  by 

[  0  0  <  k  <  m, 

fit  =  <    ak-n  -  ak     m<  k  <  n,  (4) 

(        f.k  0  <  -k  <  n. 

Since  /  is  in  the  Wiener  class,  thus  for  all  given  e  >  0,  there  exists  an 
N  >  0,  such  that  X)ttw+i  |a*|  <  c-  Let  UJM  be  the  n  by  n  matrix  obtained 
from  Rn  by  replacing  the  (n  —  N)  by  (n  —  N)  principal  submatrix  of  Rn 
by  the  zero  matrix.  Then  rank(£/W)  <  2N.  Let  WJM  =  Rn  -  UnNl  The 
leading  (n  -  N)  by  (n  -  N)  block  of  WW  is  the  (n  -  N)  by  (n  -  N) 
principal  submatrix  of  Rn,  hence  it  is  Toeplitz,  and  it  is  easy  to  see  that 
the  maximum  column  sum  of  WJ^N'  is  attained  at  the  first  column  (or  the 
(n  —  N  —  l)-th  column).  Thus 

n-N-l  n-N-l  n-S-l 

WiN%=    £    W=     E    k-n-at|<     E    l«*l<«-  (5) 

t=m+l  t=m+l  k=N+l 

Since  WW  is  Hermitian,  we  have  ||WW||oo  =  ll^^lli-  Thus 

wnN%<mni-wiN)\u><e. 


Hence  the  spectrum  of  Wj^*  lies  in  (  —  f,c).  By  Cauchy  Interlace  Theorem, 
see  Wilkinson  [13],  we  see  that  at  most  2N  eigenvalues  of  Rn  =  Cn  —  An 
have  absolute  values  exceeding  e.      D 

Combining  Theorems  1  and  2,  and  using  the  fact  that 
C-1An  =  In  +  C-\An-Cn), 
we  have 

Corollary.  Let  f  be  a  positive  function  in  the  Wiener  class,  then  for  all 
e  >  0,  there  exist  N  and  M  >  0,  such  that  for  all  n  >  N ,  at  most  M 
eigenvalues  of  C"1  An  —  In  have  absolute  values  larger  than  e. 

Thus  the  spectrum  of  C~lAn  is  clustered  around  one  for  large  n. 

3.  Superlinear  Convergence  Rate 

It  follows  easily  from  the  Corollary  of  the  last  section  that  the  conjugate 
gradient  method,  when  applied  to  the  preconditioned  system  C~1An,  con- 
verges superlinearly.  More  precisely,  for  all  e  >  0,  there  exists  a  constant 
C(e)  >  0  such  that  the  error  vector  e,  at  the  q-th  iteration  satisfies 

||e,||  <  C(c)*|M|,  (6) 

where  ||x||2  =  x'Cn  7  ACn  7x,  see  R.  Chan  and  Strang  [4]  for  a  proof.  Thus 
the  number  of  iterations  to  achieve  a  fixed  accuracy  remains  bounded  as 
the  matrix  order  n  is  increased.  Since  each  iteration  requires  0{n  log  n) 
operations  using  the  Fast  Fourier  Transform,  see  Strang  [11],  the  work  of 
solving  the  equation  Anx  =  b  to  a  given  accuracy  6  is  c(f,6)n  log  n,  where 
c(f,  6)  is  a  constant  that  depends  on  /  and  6  only. 

We  note  that  if  extra  smoothness  conditions  are  imposed  on  /,  we  can 
get  a  more  precise  bound  on  the  convergence  rate: 


Theorem  3.  Let  f  be  a  (l  +  l)-times  differcntiable  function  with  its  (l+l)-th 
derivative  of  f  in  Ll[0t2ir),  /  >  0.   Then  for  large  n, 

K||  <  {{q*1)iyMW>  (?) 

for  some  constant  c  that  depends  on  f  and  I  only. 

Proof:  We  remark  that  from  the  standard  error  analysis  of  the  conjugate 
gradient  method,  we  have 

||e,||<[minmax|P,(A)|]||e0||,  (8) 

where  the  minimum  is  taken  over  polynomials  of  degree  q  with  constant 
term  1  and  the  maximum  is  taken  over  the  spectrum  of  C'1  An,  or  equiv- 

alently,  the  spectrum  of  Cn  3  AnCn  a,  see  for  instance,  Golub  and  van  Loan 
[7].  In  the  following,  we  will  try  to  estimate  that  minimum. 
We  first  note  that  the  assumptions  on  /  imply  that 


(9) 


where  c  =  ||/^+1^||li,  see,  for  instance,  Katznelson  [9].  Hence 

E  M<*  E  TMTi±*  k   ^rsf,   v*>i. 

As  in  Theorem  2,  we  write 

Rn  =  ww  +  Uwi   Vfc>l, 

where  Unk^  is  the  matrix  obtained  from  Rn  by  replacing  its  (n  —  k)  by 
(n  —  k)  principal  submatrix  of  R„  by  a  zero  matrix.  Using  the  arguments  in 
Theorem  2,  cf  (5)  and  (9),  we  see  that  rank(^)  <  2fc  and  \\W^%  <  c/k1, 
for  all  k  >  1.  Now  consider 
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By  Theorem  1,  we  have  rank (U^)  <  2k  for  large  n  and 


c 


mik)\\2<\\c-%\\ww\\2<-,  v*>i,  (io) 

With  C  =  c//min. 

Next  we  note  that  W^  —  lVf[*+1)  can  be  written  as  the  sum  of  two  rank 
one  matrices  of  the  form: 

WW  -  W^+V  =  ukv'k  +  vku'k  =  -(utftojf  -  wl wl*)y     VJfc  >  0. 

Here  ut  is  the  (n  —  fc)-th  unit  vector,  vt  =  (rn_t_!,-  •  •  ,r1,r0/2,0)-  •  •  ,0), 

with  Tj  given  by  (4),  and  wk  =  uk  ±  v*.  Hence  by  letting  zk  —  Cn  3wk  for 
k  >  0,  we  have 

=    wW  +  Vk+-Vk~,     Vfc>l,  (11) 

where  V*  =  \  S*=o  zfzf'  are  positive  semi-definite  matrices  of  rank  k.  Let 
us  order  the  eigenvalues  of  W^  as 

Mo  <  Mf  <  ■••  <  Mi"  <  Mo- 

By  applying  Cauchy  Interlace  Theorem  to  (11)  and  using  the  bound  of 
||Wl*'||i  in  (10),  we  see  that  for  all  k  >  1,  there  are  at  most  k  eigenvalues 
of  WJP'  lying  to  the  right  of  c/k1,  and  there  are  at  most  k  of  them  lying  to 
the  left  of  —c/k1.  More  precisely,  we  have 


Irf I  <  iWlll  <  ^.     Vfc>!' 


Using  the  identity 


_1  _A 

we  see  that  if  we  order  the  eigenvalues  of  C„  *  AnCn  *  as 


Ao  <A7  <      -<A+<  A+, 


then  A£  =  1  +  pf  for  all  A:  >  0  with 


l-^<At-<A+<l  +  ^,    V*>1.  (12) 

For  Aq,  the  bounds  are  obtained  from  (l)  and  (3): 

7^  <  A0-  <  A0+  <  &=.  (13) 

/max  Jmin 

Having  obtained  the  bounds  for  Aj,  we  can  now  construct  the  poly- 
nomial that  will  give  us  a  bound  for  (8).  Our  idea  is  to  choose  P2q  that 
annihilates  the  q  extreme  pairs  of  eigenvalues.  Thus  consider 

*(x)  =  (1  -  ^)(1  -  j|),    V*>1. 

Between  those  roots  Aj ,  the  maximum  of  |pt(x)|  is  attained  at  the  average 
x  =  |(AJ  +  A^ ),  where  by  (12),  we  have 

Similarly,  for  fc  =  0,  we  have,  by  using  (13), 

max      |p0(x)|  =    ^0    ~  A0")2    <    (/max  -   fLn)\ 

*elK*tl    *  "  4A0+A0-      -         4/Vn 

Hence  the  polynomial  P2g  =  popi  •  •  -p,-i,  which  annihilates  the  q  extreme 
pairs  of  eigenvalues,  satisfies 

|P*(*)|  <  {{g    '\V)2l,  (14) 

for  some  constant  c  that  depends  only  on  /  and  /.  This  holds  for  all  A*  in 
the  inner  interval  between  A~_x  and  A+_x,  where  the  remaining  eigenvalues 
are.  Equation  (7)  now  follows  directly  from  (8)  and  (14).       □ 
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4.  Other  Circulant  Preconditioned. 

The  proof  of  Theorem  2  suggests  that  there  are  many  other  viable  pre- 
conditioners  that  can  give  us  the  same  asymptotic  convergence  rate.  One 
example  is  given  by  the  circulant  matrix  Bn  proposed  by  T.  Chan  [6].  It  is 
obtained  by  averaging  the  corresponding  diagonals  of  An  with  the  diagonals 
of  An  being  extended  to  length  n  by  a  wrap-around.  More  precisely,  the 
entries  6j,  =  6,_,  of  Bn  are  given  by 


0  <  -k  <  n, 


where  an  is  taken  to  be  0.  He  proved  that  such  Bn  minimizes  the  Frobenius 
norm  ||Bn  —  A„\\f  over  all  possible  circulant  matrices  Bn.  The  entries 
r^  =  r,_y  of  Bn  -  An  are  given  by 

(k 
-(ok_„-ot)      0<k<n, 
n 
f-k  0  <  -k  <  n. 

As  in  Theorem  2,  we  let  Wj^'  to  be  the  matrix  obtained  from  Bn  —  An  by 
replacing  the  last  N  rows  and  N  columns  of  Bn  —  An  by  zero  vectors.  We 
see  that 

ll^)lli<2nE"1|rt|<2f;^at|  +  4    £     \ak\.  (15) 

t=0  k=0  n  k=N+\ 

Now  let  M  >  N  be  such  that  £E£=o*la*l  <  e-  Then  for  a11  n  >  M, 
we  have  ||^^'||i  <  6t.  Hence  the  eigenvalues  of  Bn  —  An  are  clustered 
around  zero,  except  for  at  most  2N  of  them.  We  remark  that  by  using 
results  in  R.  Chan  [5],  we  can  show  that  limn_00  \\Cn  —  BM||j  =  0  and  that 
the  convergence  rate  of  C~lAn  and  B~lAn  are  the  same  for  large  n.  In 
particular,  both  will  converge  superlinearly. 

As  another  example,  let  us  consider  the  circulant  matrix  D„  with  entries 
da  =  di.j  given  by 

d     =  I  a*~-  +  a*      °  ~  k  <  "' 
*    "  \        d.k         0  <  -k  <n, 
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where  an  is  again  taken  to  be  0.  The  entries  r,y  =  r<_;  of  Dn  —  An  are  given 
by 


_  f  ak-n      0<k<n, 
r*    "  \    f_k     0  <  -k  <  n. 


It  is  easily  seen  that  the  conclusion  of  Theorem  2  holds  for  this  precondi- 
tioner  too,  cf  (5)  and  (15).  Similar  to  the  case  of  Bn,  we  can  also  show 
that  lim„_oo  \\Cn  —  Dn\\2  =  0  and  that  the  convergence  rate  of  C~lAn  and 
D~lAn  are  the  same  for  large  n,  see  R.  Chan  [5].  Numerical  results  in  §6 
indeed  show  that  the  three  preconditioned  Bn,  Cn  and  Dn  behave  almost 
the  same  for  large  n. 

5.  The  Optimality  of  Cn. 

From  the  discussion  in  §§2  and  4,  we  know  that  it  is  interesting  to  obtain 
the  Hermitian  circulant  matrix  Fn  that  minimizes  the  norm  \\Fn  —  An\\i  = 
\\Fn  —  A, | |oo.  The  minimum  is  attained  by  Cn: 

Theorem  4.  The  circulant  matrix  Cn  whose  entries  are  given  by  (2)  min- 
imizes \\Fn  —  An\\i  =  \\Fn  —  ^4n||oo  over  all  possible  Hermitian  circulant 
matrices  Fn. 

Proof:  Let  us  construct  the  circulant  matrix  Gn  that  minimizes  the  column 
sums  of  Gn  —  An.  Let  the  (t,j)-th  entries  of  Gn  be  gy  =  <7,_;.  Since  Gn 
is  Hermitian  and  circulant,  we  have  gk  =  gn-k  for  k  =  l,...,m,  where 
m  =  (n  -  l)/2.  Hence  Gn  is  determined  by  {gic}?^-  For  j  =  0, . . .  ,n  -  1, 
the  j'-th  column  sum  s,  of  G„  —  An  is  given  by 

n-l-j  j 

si=    E    \ak-gk\  +  ^2\ak-gk\.  (16) 

t=o  *=i 

We  note  that  sn-\-j  =  Sj  for  0  <  j  <  n.  Hence  it  suffices  to  consider  s, 
for  0  <  j  <  m.  The  term  involving  g0  in  (16)  is  \oq  —  <7o|  which  has  its 
minimum  at  g0  =  ao.  For  k  —  1, ...,m,  the  terms  involving  gk  in  (16)  are 
either  of  the  form 

(a)     \ak  -  gk\  +  \ak  -  gk\  =  2\ak  -  gk\, 
or    (b)     \ak  -  gk\  +  \an.k  -  gn-k\  =  \ak  -  gk\  +  \an-k  -  gk\. 
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In  case  (a),  the  minimum  is  at  gk  =  ak.  In  case  (b),  the  minimum  occurs  at 
any  gk  lying  on  the  line  segment  joining  ak  and  an_4.  In  particular,  (a)  and 
(b)  attain  their  minima  at  gk  =  ak.  Thus  Gn  so  constructed  is  the  same  as 
the  Cn  given  by  (2). 

Now  for  any  other  circulant  matrix  En,  the  j-th  column  sum  t;  of  En  —  An 
will  satisfy  Sj  <  tj,  for  j  =  0, . . . ,  n  —  1.  Hence, 

\\Cn  -  An\\i  =  maxs,  <  max*,-  =  \\En  -  An\\v     D 


Remark:  When  n  =  2m  is  even,  gm  is  real  as  Gn  is  both  Hermitian  and 
circulant.  The  term  involving  gm  in  S,  takes  the  form  \am  —  gm\  or  \am  —  gm\. 
Since  s;  =  3n_i_y  for  j  =  0,  •  •  • ,  n  —  1,  we  see  that  gm  should  be  chosen  such 
that  both  terms  are  minimized,  i.e, 

9m  =  -{am  +  am).  (17) 


6.  Numerical  Results. 

To  test  the  convergence  rates  of  the  preconditioned,  we  have  applied 
the  preconditioned  conjugate  gradient  method  to  Anx  =  6  with 


ak 


2  )fc  =  0, 

a-k         k  <  0. 


The  underlying  generating  function  /  is  given  by 

sin(Jfc0)  +  cos(kB) 


/(*)  =  2£ 


t=o 


(1  +  it)11 


Clearly  /  is  in  the  Wiener  class.  The  spectra  of  An,  B~lAn,  C~lA„  and 
D~lAn  for  n  =  32  are  represented  in  Figure  1.  Table  1  shows  the  number 
of  iterations  required  to  make  ||r,||2/||r0||j  <  10"7,  where  rq  is  the  residual 

11 


vector  after  q  iterations.  The  right  hand  side  b  is  the  vector  of  all  ones 
and  the  zero  vector  is  our  initial  guess.  We  see  that  as  n  increases,  the 
number  of  iterations  increases  like  O(logn)  for  the  original  matrix  An, 
while  it  stays  almost  the  same  for  the  preconditioned  matrices.  Moreover, 
all  preconditioned  systems  converge  at  the  same  rate  for  large  n. 


n 

An 

B~lAn 

C~lAn 

D~lAn 

16 

13 

7 

8 

7 

32 

15 

6 

7 

6 

64 

18 

7 

7 

7 

128 

19 

7 

7 

7 

256 

21 

7 

7 

7 

Table  1.  Number  of  Iterations  for  Different  Systems 
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